Beyond NGS data sharing and towards open science
Authors
Abstract
The biosciences have been revolutionized by next-generation sequencing (NGS) technologies in recent years, opening new perspectives in medical, industrial and environmental applications. Although our motivation comes from the biosciences, the following holds for many areas of science: published results are usually hard to reproduce, either because the data are not available or because the tools are not readily available, which delays the adoption of new methodologies and hinders innovation. Our focus is on tool readiness and pipeline availability. Even though most tools are freely available, data analysis pipelines are in general barely described, and their configuration is far from trivial, with many parameters to be tuned. In this paper we discuss how to effectively build and use pipelines, relying on state-of-the-art computing technologies to execute them without users needing to configure, install and manage tools, servers and complex workflow management systems. We perform an in-depth comparative analysis of state-of-the-art frameworks and systems. We propose the NGSPipes framework and show that public pipelines can be made ready to process and analyse experimental data, produced for instance by high-throughput technologies, without relying on centralized servers or Web services. The NGSPipes framework and its underlying architecture provide a major step towards open science and true collaboration on tools and pipelines among computational biology researchers and practitioners. We show that it is possible to execute data analysis pipelines in a decentralized and platform-independent way. Approaches like the one proposed are crucial for archiving and reusing data analysis pipelines in the medium and long term. The NGSPipes framework is freely available at http://ngspipes.github.io/.
Similar resources
Beyond Cross-Cultural Philosophy: Towards a New Enlightenment
The acculturalization of the humanities from the late 1980s onwards led not only to imagined different worlds (e.g. West / Islam); postmodernity also overshadowed the common ground of the world's philosophies. Christianity and Islam share far more than what might separate them, and we find Islam in "the West" as Christianity "in the East". The Logos of Life Philosophy as developed by Anna-Teresa Tymieni...
BiobankCloud: A Platform for the Secure Storage, Sharing, and Processing of Large Biomedical Data Sets
Biobanks store and catalog human biological material that is increasingly being digitized using next-generation sequencing (NGS). There is, however, a computational bottleneck, as existing software systems are not scalable and secure enough to store and process the incoming wave of genomic data from NGS machines. In the BiobankCloud project, we are building a Hadoop-based platform for the secur...
A Multi-commodity Pickup and Delivery Open-tour m-TSP Formulation for Bike Sharing Rebalancing Problem
Bike sharing systems (BSSs) offer a mobility service whereby public bikes, located at different stations across an urban area, are available for shared use. An important point is that rides are not uniformly distributed between stations, so certain stations fill up or empty over time. These empty and full stations lead to demand for bikes and return boxes that cannot be fulfi...
Sharing code
Sharing code is becoming increasingly important in the wake of Open Science. In this review I describe and compare two popular code-sharing utilities, GitHub and Open Science Framework (OSF). GitHub is a mature, industry-standard tool but lacks focus towards researchers. In comparison, OSF offers a one-stop solution for researchers but a lot of functionality is still under development. I conclu...
Open Science and eGEMs: Our Role in Supporting a Culture of Collaboration in Learning Health Systems
"Open science" includes a variety of approaches to facilitate greater access to data and the information produced by processes of scientific inquiry. Recently, the health sciences community has been grappling with the issue of potential pathways and models to achieve the goals of open science, namely, to create and rapidly share reproducible health research. eGEMs' continued dedication to and mi...
Journal title:
- CoRR
Volume: abs/1701.03507, Issue: -
Pages: -
Publication date: 2016